Mental Models of Autonomy and Sentience Shape Reactions to AI

Pauketat, Janet V. T., Shank, Daniel B., Manoli, Aikaterina, Anthis, Jacy Reese

arXiv.org Artificial Intelligence

Narratives about artificial intelligence (AI) entangle autonomy, the capacity to self-govern, with sentience, the capacity to sense and feel. AI agents that perform tasks autonomously and companions that recognize and express emotions may activate mental models of autonomy and sentience, respectively, provoking distinct reactions. To examine this possibility, we conducted three pilot studies (N = 374) and four preregistered vignette experiments describing an AI as autonomous, sentient, both, or neither (N = 2,702). Activating a mental model of sentience increased general mind perception (cognition and emotion) and moral consideration more than activating a mental model of autonomy did, but autonomy increased perceived threat more than sentience did. Sentience also increased perceived autonomy more than autonomy increased perceived sentience. A within-paper meta-analysis showed that, on average, sentience changed reactions more than autonomy. By disentangling these mental models of AI, we can study human-AI interaction with more precision and better navigate the design of anthropomorphized AI and prompting interfaces.
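The design crosses two binary vignette factors, autonomy and sentience, yielding four conditions (autonomous, sentient, both, neither). Below is a minimal sketch of how such a 2x2 between-subjects design could be analyzed with a two-way ANOVA; the data, effect sizes, and variable names are synthetic illustrations, not the paper's materials or results.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for autonomy in (0, 1):
    for sentience in (0, 1):
        # Hypothetical effect sizes: sentience moves moral consideration more
        # than autonomy, loosely echoing the abstract's direction of results.
        y = 3.0 + 0.2 * autonomy + 0.8 * sentience + rng.normal(0, 1.0, 50)
        rows += [
            {"autonomy": autonomy, "sentience": sentience, "moral_consideration": v}
            for v in y
        ]
df = pd.DataFrame(rows)

# Two-way between-subjects ANOVA with the autonomy x sentience interaction.
model = smf.ols("moral_consideration ~ C(autonomy) * C(sentience)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```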


TeamMedAgents: Enhancing Medical Decision-Making of LLMs Through Structured Teamwork

Mishra, Pranav Pushkar, Arvan, Mohammad, Zalake, Mohan

arXiv.org Artificial Intelligence

Building upon Salas et al.'s "Big Five" teamwork model, we operationalize five core components as independently configurable mechanisms: shared mental models, team leadership, team orientation, trust networks, and mutual monitoring. Our architecture dynamically recruits 2-4 specialist agents and employs structured four-phase deliberation with adaptive component selection. Evaluation across eight medical benchmarks encompassing 11,545 questions demonstrates that TeamMedAgents achieves 77.63% overall accuracy (text-based: 81.30%, vision-language: 66.60%). Systematic ablation studies comparing three single-agent baselines (Zero-Shot, Few-Shot, CoT) against individual teamwork components reveal task-specific optimization patterns: shared mental models excel on knowledge tasks, trust mechanisms improve differential diagnosis, and comprehensive integration of all components degrades performance. Adaptive component selection yields 2-10 percentage point improvements over the strongest baselines, with 96.2% agent convergence validating the effectiveness of structured coordination. TeamMedAgents establishes a principled methodology for translating human teamwork theory into multi-agent systems, demonstrating that evidence-based collaboration patterns enhance AI performance in safety-critical domains through modular component design and selective activation strategies.
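As a sketch of what "independently configurable mechanisms" with selective activation might look like in code: the component names below follow the abstract, but the configuration interface and selection rules are illustrative assumptions, not TeamMedAgents' implementation.

```python
from dataclasses import dataclass, field

COMPONENTS = (
    "shared_mental_models",
    "team_leadership",
    "team_orientation",
    "trust_networks",
    "mutual_monitoring",
)

@dataclass
class TeamConfig:
    active: set = field(default_factory=set)  # which teamwork components are on

    def enable(self, component: str) -> "TeamConfig":
        assert component in COMPONENTS, f"unknown component: {component}"
        self.active.add(component)
        return self

def select_components(task_type: str) -> TeamConfig:
    # Hypothetical adaptive selection mirroring the ablation findings:
    # knowledge tasks -> shared mental models; differential diagnosis -> trust.
    # Deliberately avoids enabling everything at once, since the abstract
    # reports that comprehensive integration degrades performance.
    cfg = TeamConfig()
    if task_type == "knowledge":
        cfg.enable("shared_mental_models")
    elif task_type == "differential_diagnosis":
        cfg.enable("trust_networks")
    return cfg

print(select_components("knowledge").active)  # {'shared_mental_models'}
```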


Counterfactual Simulatability of LLM Explanations for Generation Tasks

Limpijankit, Marvin, Chen, Yanda, Subbiah, Melanie, Deas, Nicholas, McKeown, Kathleen

arXiv.org Artificial Intelligence

LLMs can be unpredictable: even slight alterations to a prompt can change the output in unexpected ways. The ability of models to accurately explain their own behavior is therefore critical, especially in high-stakes settings. One approach to evaluating explanations is counterfactual simulatability: how well an explanation enables users to infer the model's output on related counterfactual inputs. Counterfactual simulatability has previously been studied for yes/no question answering tasks. We provide a general framework for extending this method to generation tasks, using news summarization and medical suggestion as example use cases. We find that while LLM explanations do enable users to better predict LLM outputs on counterfactuals in the summarization setting, there is significant room for improvement in the medical suggestion setting. Furthermore, our results suggest that evaluating counterfactual simulatability may be more appropriate for skill-based tasks than for knowledge-based tasks.
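A minimal sketch of a counterfactual-simulatability score: the fraction of counterfactual inputs on which a simulator (a user, or a proxy model reading the explanation) predicts the LLM's actual output. Since exact string match is too strict for generation tasks, a lexical-similarity threshold stands in for a match judgment here; the similar() helper and its threshold are illustrative assumptions, not the paper's metric.

```python
from difflib import SequenceMatcher

def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Crude lexical similarity as a stand-in for a semantic match judgment."""
    return SequenceMatcher(None, a, b).ratio() >= threshold

def simulatability(model_outputs: list, simulated_outputs: list) -> float:
    """Fraction of counterfactuals where the simulated output matches the
    model's actual output; higher means the explanation was more predictive."""
    assert len(model_outputs) == len(simulated_outputs)
    hits = sum(similar(m, s) for m, s in zip(model_outputs, simulated_outputs))
    return hits / len(model_outputs)

# Hypothetical usage: model vs. simulated outputs on three counterfactual prompts.
print(simulatability(
    ["take ibuprofen", "rest and hydrate", "see a doctor"],
    ["take ibuprofen", "rest and drink fluids", "call a nurse"],
))
```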



NoteEx: Interactive Visual Context Manipulation for LLM-Assisted Exploratory Data Analysis in Computational Notebooks

Payandeh, Mohammad Hasan, Yuan, Lin-Ping, Zhao, Jian

arXiv.org Artificial Intelligence

Computational notebooks have become popular for Exploratory Data Analysis (EDA), augmented by LLM-based code generation and result interpretation. Effective LLM assistance hinges on selecting informative context -- the minimal set of cells whose code, data, or outputs suffice to answer a prompt. As notebooks grow long and messy, users lose track of the mental model of their analysis and thus fail to curate appropriate context for LLM tasks, causing frustration and tedious prompt engineering. We conducted a formative study (n=6) that surfaced challenges in LLM context selection and mental model maintenance. Informed by these findings, we introduce NoteEx, a JupyterLab extension that provides a semantic visualization of the EDA workflow, allowing analysts to externalize their mental model, specify analysis dependencies, and interactively select task-relevant context for LLMs. A user study (n=12) comparing NoteEx against a baseline shows that NoteEx improved mental model retention and context selection, leading to more accurate and relevant LLM responses.
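One way to read "specify analysis dependencies" and "interactively select task-relevant context" computationally: given user-declared dependency edges between cells, collect the transitive ancestors of the cell a prompt targets and send only those cells as LLM context. The sketch below is an illustrative assumption about this mechanism, not NoteEx's actual API.

```python
def select_context(target: str, depends_on: dict) -> list:
    """Return the target cell plus all cells it transitively depends on,
    i.e. the minimal ancestor set to include as LLM context."""
    seen, stack = set(), [target]
    while stack:
        cell = stack.pop()
        if cell not in seen:
            seen.add(cell)
            stack.extend(depends_on.get(cell, []))
    return sorted(seen)

# Hypothetical EDA workflow: load -> clean -> {plot, model}.
deps = {"clean": ["load"], "plot": ["clean"], "model": ["clean"]}
print(select_context("model", deps))  # ['clean', 'load', 'model']
```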


How AI Companionship Develops: Evidence from a Longitudinal Study

Hwang, Angel Hsing-Chi, Li, Fiona, Anthis, Jacy Reese, Noh, Hayoun

arXiv.org Artificial Intelligence

The rapidly growing popularity of AI companions poses risks to mental health, personal wellbeing, and social relationships. Past work has identified many individual factors that can drive human-companion interaction, but we know little about how these factors interact and evolve over time. In Study 1, we surveyed AI companion users (N = 303) to map the psychological pathway from users' mental models of the agent to parasocial experiences, social interaction, and the psychological impact of AI companions. Participants' responses foregrounded multiple interconnected variables (agency, parasocial interaction, and engagement) that shape AI companionship. In Study 2, we conducted a longitudinal study with a subset of participants (N = 110) using a new generic chatbot. Participants' perceptions of the generic chatbot significantly converged toward their perceptions of their own companions by Week 3. These results suggest a longitudinal model of AI companionship development and demonstrate an empirical method for studying human-AI companionship.
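One simple way to quantify the convergence reported in Study 2 is to track the mean absolute gap between each participant's ratings of the new generic chatbot and of their own companion, week by week. The sketch below uses synthetic data and is an illustrative assumption, not the paper's statistical model.

```python
import numpy as np

def weekly_gap(chatbot: np.ndarray, companion: np.ndarray) -> np.ndarray:
    """chatbot, companion: (weeks, participants) rating matrices on one item.
    Returns the mean absolute gap per week; convergence = shrinking values."""
    return np.abs(chatbot - companion).mean(axis=1)

rng = np.random.default_rng(1)
weeks, n = 3, 110
companion = rng.uniform(4, 6, size=(weeks, n))   # stable companion ratings
drift = np.linspace(2.0, 0.3, weeks)[:, None]    # hypothetical narrowing gap
chatbot = companion - drift + rng.normal(0, 0.2, (weeks, n))
print(weekly_gap(chatbot, companion))            # decreases by Week 3
```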


Development of Mental Models in Human-AI Collaboration: A Conceptual Framework

Holstein, Joshua, Satzger, Gerhard

arXiv.org Artificial Intelligence

Artificial intelligence has become integral to organizational decision-making, and while research has explored many facets of this human-AI collaboration, the focus has mainly been on designing the AI agent(s) and the way the collaboration is set up -- generally assuming the human decision-maker to be "fixed". It has largely been neglected, however, that decision-makers' mental models evolve through their continuous interaction with AI systems. This paper addresses that gap by conceptualizing how the design of human-AI collaboration influences the development of three complementary and interdependent mental models necessary for this collaboration. We develop an integrated socio-technical framework that identifies the mechanisms driving mental model evolution: data contextualization, reasoning transparency, and performance feedback.